# High-precision speech transcription

Whisper Medium Oswald
Apache-2.0
Haitian Creole speech recognition model fine-tuned from OpenAI Whisper-medium, focused on high-accuracy transcription.
Speech Recognition Transformers Other
jsbeaudry
102
1
Distil Whisper Large V3 Ptbr
MIT
A fine-tuned version of distil-whisper-large-v3 for automatic speech recognition (ASR) of Brazilian Portuguese, trained on a combination of the Common Voice 16 dataset and a private dataset.
Speech Recognition Safetensors
freds0
580
5
Reverb Asr
Other
Rev's Reverb ASR model is trained on 200,000 hours of professionally transcribed English speech data, making it one of the most accurate open-source automatic speech recognition systems for English.
Speech Recognition English
Revai
17
84
Whisper Medium Pt
Apache-2.0
Portuguese-optimized Whisper Medium speech recognition model, achieving a word error rate (WER) of 6.579 on the Common Voice 11 dataset.
Speech Recognition Transformers Other
jlondonobo
85
15
Exp W2v2t It Wavlm S895
Apache-2.0
An Italian automatic speech recognition model fine-tuned from microsoft/wavlm-large on the Common Voice 7.0 Italian dataset.
Speech Recognition Transformers Other
jonatasgrosman
42
0
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V3
An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53, specializing in singing voice recognition.
Speech Recognition Transformers
gary109
97
0
Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V4 1
This model is an automatic speech recognition (ASR) model based on the wav2vec2-large-xlsr-53 architecture, fine-tuned on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 dataset, primarily used for singing voice recognition tasks.
Speech Recognition Transformers
gary109
66
1
Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V5
Apache-2.0
An automatic speech recognition model based on wav2vec2-large-xlsr-53, fine-tuned on the GARY109/AI_LIGHT_DANCE dataset.
Speech Recognition Transformers
gary109
160
0
Ai Light Dance Singing Ft Pretrain Wav2vec2 Large Lv60
This model is an automatic speech recognition (ASR) model based on the wav2vec2-large-lv60 architecture, fine-tuned on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING dataset, primarily used for singing voice recognition tasks.
Speech Recognition Transformers
gary109
22
0
Wav2vec2 2 Bart Large No Adapter
This model is an automatic speech recognition (ASR) model trained on the LibriSpeech ASR dataset, capable of converting English speech into text.
Speech Recognition Transformers
sanchit-gandhi
22
0
Wav2vec2 2 Bert Large No Adapter
An automatic speech recognition (ASR) model trained on the LibriSpeech dataset for converting English speech to text.
Speech Recognition Transformers
speech-seq2seq
15
1
Wav2vec2 2 Bert Large No Adapter Frozen Enc
This model is a speech recognition model trained on the librispeech_asr dataset, achieving a word error rate (WER) of 2.0133 on the evaluation set.
Speech Recognition Transformers
speech-seq2seq
25
2
Wav2vec2 Large Xlsr 53 French
Apache-2.0
This is an automatic speech recognition (ASR) model based on the wav2vec2 architecture, specifically fine-tuned for French, achieving a word error rate (WER) of 12.82% on the Common Voice French test set.
Speech Recognition Transformers French
Ilyes
31
4
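Several entries above quote a word error rate (WER), in differing scales (6.579, 2.0133, 12.82%). For readers comparing those figures, here is a minimal sketch of how WER is commonly computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The listing does not say how each model's figure was produced, so the function name `word_error_rate`, the whitespace tokenization, and the absence of any text normalization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper: splits on whitespace and applies no casing or
    punctuation normalization, which real evaluations often do.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # reference word missing from output (deletion)
                curr[j - 1] + 1,                  # extra word in output (insertion)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f} ({wer * 100:.1f}%)")
```

The function returns a fraction; multiplying by 100 gives the percentage form used by some entries. Because insertions count as errors, WER can exceed 1.0 (100%) when the output is much longer than the reference.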